320 research outputs found

    Laniakea: an open solution to provide Galaxy "on-demand" instances over heterogeneous cloud infrastructures

    Background: While the popular workflow manager Galaxy is currently made available through several publicly accessible servers, there are scenarios where users can be better served by full administrative control over a private Galaxy instance, including, but not limited to, concerns about data privacy, customisation needs, prioritisation of particular job types, tools development, and training activities. In such cases, a cloud-based Galaxy virtual instance represents an alternative that equips the user with complete control over the Galaxy instance itself without the burden of the hardware and software infrastructure involved in running and maintaining a Galaxy server. Results: We present Laniakea, a complete software solution to set up a “Galaxy on-demand” platform as a service. Building on the INDIGO-DataCloud software stack, Laniakea can be deployed over common cloud architectures usually supported both by public and private e-infrastructures. The user interacts with a Laniakea-based service through a simple front-end that allows a general setup of a Galaxy instance, and then Laniakea takes care of the automatic deployment of the virtual hardware and the software components. At the end of the process, the user gains access with full administrative privileges to a private, production-grade, fully customisable, Galaxy virtual instance and to the underlying virtual machine (VM). Laniakea features deployment of single-server or cluster-backed Galaxy instances, sharing of reference data across multiple instances, data volume encryption, and support for VM image-based, Docker-based, and Ansible recipe-based Galaxy deployments. A Laniakea-based Galaxy on-demand service, named Laniakea@ReCaS, is currently hosted at the ELIXIR-IT ReCaS cloud facility. Conclusions: Laniakea offers to scientific e-infrastructures a complete and easy-to-use software solution to provide a Galaxy on-demand service to their users. Laniakea-based cloud services will help in making Galaxy more accessible to a broader user base by removing most of the burdens involved in deploying and running a Galaxy service. In turn, this will facilitate the adoption of Galaxy in scenarios where classic public instances do not represent an optimal solution. Finally, the implementation of Laniakea can be easily adapted and expanded to support different services and platforms beyond Galaxy.

    Laniakea: a Galaxy-on-demand Provider Platform Through Cloud Technologies

    Galaxy is rapidly becoming the de facto standard workflow manager for bioinformatics. Although several Galaxy public services are currently available, the usage of a private Galaxy instance is still mandatory or preferable for several use cases, including heavy workloads, data privacy concerns or particular customization needs. In this context, cloud computing technologies and infrastructures can provide a powerful and scalable solution to avoid the onerous deployment and maintenance of a local hardware and software infrastructure. Laniakea is a software framework that facilitates the provisioning of on-demand Galaxy instances as a cloud service over e-infrastructures by leveraging the open-source software catalogue developed by the INDIGO-DataCloud H2020 project, which aimed to make cloud e-infrastructures more accessible to scientific communities. End-users interact with Laniakea through a web front-end that allows a general setup of a Galaxy instance. The deployment of the virtual hardware and of the Galaxy software ecosystem is subsequently performed by the INDIGO Platform as a Service layer. At the end of the process, the user gains access to a private, production-grade, fully customizable Galaxy virtual instance. Laniakea features the deployment of stand-alone or cluster-backed Galaxy instances, shared reference data volumes, encrypted data volumes and rapid development of novel Galaxy flavours for specific tasks. We present here the latest development iteration of Laniakea, introducing a novel and highly configurable web interface that allows more straightforward customisation of the user experience through human-readable YAML syntax, and a reworked encryption procedure that uses HashiCorp Vault as the encryption key management system.

    Laniakea@ReCaS: an ELIXIR-ITALY Galaxy on-demand cloud service

    Although several Galaxy public services are available, a private Galaxy instance is still mandatory or preferable for several use cases including heavy workloads, data privacy concerns or particular customization needs. Cloud computing technologies provide a viable way to deploy Galaxy private instances, freeing users from the onerous deployment and maintenance of local IT infrastructures. In the last few years, ELIXIR-IT led the development of Laniakea, a software framework that facilitates the provisioning of on-demand Galaxy instances as a cloud service over e-infrastructures. The user interacts with a Laniakea service through a web front-end that makes it straightforward to configure and launch a production-grade Galaxy instance. Through the interface, the user can deploy Galaxy instances over single VMs or virtual clusters, and link them to shared reference data volumes and to plain or encrypted volumes for storing data. A selection of “flavours”, that is, Galaxy instances pre-configured with sets of tools for specific tasks, is also available. When the user is satisfied, Laniakea takes over and deploys the desired Galaxy instance over the cloud, providing a public IP and full administrative privileges over the new instance. In December 2018, we launched the beta-test phase of the first Laniakea-based Galaxy on-demand ELIXIR-IT service: Laniakea@ReCaS. After six months of helpful testing, we are now ready to announce the production phase of this service. Access to the service will be provided on a per-project basis through an open-ended call defining terms and conditions; project proposals will be evaluated by a scientific and technical board. Accepted proposals will be granted a package of computational resources for running on-demand Galaxy instances for a duration compatible with the project requirements.

    TOSCA-based orchestration of complex clusters at the IaaS level

    This paper describes the adoption and extension of the TOSCA standard by the INDIGO-DataCloud project for the definition and deployment of complex computing clusters together with the required support in both OpenStack and OpenNebula, carried out in close collaboration with industry partners such as IBM. Two examples of these clusters are described in this paper: the definition of an elastic computing cluster to support the Galaxy bioinformatics application, where the nodes are dynamically added and removed from the cluster to adapt to the workload, and the definition of a scalable Apache Mesos cluster for the execution of batch jobs and support for long-running services. The coupling of TOSCA with Ansible Roles to perform automated installation has resulted in the definition of high-level, deterministic templates to provision complex computing clusters across different Cloud sites. The authors would like to thank the European Commission for the financial support for project INDIGO-DataCloud (RIA 653549). Caballer Fernández, M.; Donvito, G.; Moltó, G.; Rocha, R.; Velten, M. (2017). TOSCA-based orchestration of complex clusters at the IaaS level. Journal of Physics: Conference Series (Online). 898:1-8. https://doi.org/10.1088/1742-6596/898/8/082036

    CMS Monte Carlo production in the WLCG computing Grid

    Monte Carlo production in CMS has received a major boost in performance and scale since the past CHEP06 conference. The production system has been re-engineered in order to incorporate the experience gained in running the previous system and to integrate production with the new CMS event data model, data management system and data processing framework. The system is interfaced to the two major computing Grids used by CMS, the LHC Computing Grid (LCG) and the Open Science Grid (OSG). Operational experience and integration aspects of the new CMS Monte Carlo production system are presented together with an analysis of production statistics. The new system automatically handles job submission, resource monitoring, job queuing, job distribution according to the available resources, data merging, registration of data into the data bookkeeping, data location, data transfer and placement systems. Compared to the previous production system, automation, reliability and performance have been considerably improved. A more efficient use of computing resources and a better handling of the inherent Grid unreliability have resulted in an increase of production scale by about an order of magnitude: the system is capable of running on the order of ten thousand jobs in parallel and yielding more than two million events per day.

    Laniakea: an open solution to provide Galaxy "on-demand" instances over heterogeneous cloud infrastructures

    Background: Galaxy is rapidly becoming the de facto standard among workflow managers for bioinformatics. A rich feature set, its overall flexibility, and a thriving community of enthusiastic users are among the main factors contributing to the popularity of Galaxy and Galaxy-based applications. One of the main advantages of Galaxy consists in providing access to sophisticated analysis pipelines, e.g., involving numerous steps and large data sets, even to users lacking computer proficiency, while at the same time improving reproducibility and facilitating teamwork and data sharing among researchers. Although several Galaxy public services are currently available, these resources are often overloaded with a large number of jobs and offer little or no customization options to end users. Moreover, there are scenarios where a private Galaxy instance still constitutes a more viable alternative, including, but not limited to, heavy workloads, data privacy concerns or particular needs of customization. In such cases, a cloud-based virtual Galaxy instance can represent a solution that overcomes the typical burdens of managing the local hardware and software infrastructure needed to run and maintain a production-grade Galaxy service. Results: Here we present Laniakea, a robust and feature-rich software suite which can be deployed on any scientific or commercial Cloud infrastructure in order to provide a "Galaxy on demand" Platform as a Service (PaaS). Laying its foundations on the INDIGO-DataCloud middleware, which has been developed to accommodate the needs of a large number of scientific communities, Laniakea can be deployed and provisioned over multiple architectures by private or public e-infrastructures. The end user interacts with Laniakea through a front-end that allows a general setup of the Galaxy instance, then Laniakea takes charge of the deployment both of the virtual hardware and all the software components. At the end of the process the user has access to a private, production-grade, yet fully customizable, Galaxy virtual instance. Laniakea supports the deployment of plain or cluster-backed Galaxy instances, shared reference data volumes, encrypted data volumes and rapid development of novel Galaxy flavours, that is, Galaxy configurations tailored for specific tasks. As a proof of concept, we provide a demo Laniakea instance hosted at an ELIXIR-IT Cloud facility. Conclusions: The migration of scientific computational services towards virtualization and e-infrastructures is one of the most visible trends of our times. Laniakea provides Cloud administrators with a ready-to-use software suite that enables them to offer Galaxy, a popular workflow manager for bioinformatics, as an on-demand PaaS to their users. We believe that Laniakea can help make the many advantages of using Galaxy more accessible to a broader user base by removing most of the burdens involved in running a private instance. Finally, Laniakea's design is sufficiently general and modular that it could be easily adapted to support different services and platforms beyond Galaxy.

    Optimization of Italian CMS Computing Centers via MIUR funded Research Projects

    In 2012, 14 Italian institutions participating in LHC experiments (10 of them in CMS) won a grant from the Italian Ministry of Research (MIUR) to optimize analysis activities and, more generally, the Tier2/Tier3 infrastructure. A wide range of activities is under way: they cover data distribution over WAN, dynamic provisioning for both scheduled and interactive processing, design and development of tools for distributed data analysis, and tests on porting the CMS software stack to new high-performance, low-power architectures.

    Distributed Computing Grid Experiences in CMS

    The CMS experiment is currently developing a computing system capable of serving, processing and archiving the large number of events that will be generated when the CMS detector starts taking data. During 2004 CMS undertook a large-scale data challenge to demonstrate the ability of the CMS computing system to cope with a sustained data-taking rate equivalent to 25% of startup rate. Its goals were: to run CMS event reconstruction at CERN for a sustained period at 25 Hz input rate; to distribute the data to several regional centers; and to enable data access at those centers for analysis. Grid middleware was utilized to help complete all aspects of the challenge. To continue to provide scalable access from anywhere in the world to the data, CMS is developing a layer of software that uses Grid tools to gain access to data and resources, and that aims to provide physicists with a user-friendly interface for submitting their analysis jobs. This paper describes the data challenge experience with Grid infrastructure and the current development of the CMS analysis system.

    MRI analysis for Hippocampus segmentation on a distributed infrastructure

    Medical image computing raises new challenges due to the scale and the complexity of the required analyses. Medical image databases are currently available to support clinical diagnosis. For instance, it is possible to provide diagnostic information based on an imaging biomarker by comparing a single case to a reference group (controls or patients with disease). At the same time, many sophisticated and computationally intensive algorithms have been implemented to extract useful information from medical images. Many applications would benefit greatly from scientific workflow technology because of its design, rapid implementation and reuse. However, this technology requires a distributed computing infrastructure (such as Grid or Cloud) to be executed efficiently. One of the most used workflow managers for medical image processing is the LONI Pipeline (LP), a graphical workbench developed by the Laboratory of Neuro Imaging (http://pipeline.loni.usc.edu). In this article we present a general approach to submit and monitor workflows on distributed infrastructures using LONI Pipeline, including the European Grid Infrastructure (EGI) and a Torque-based batch farm. We implemented a complete segmentation pipeline for brain magnetic resonance imaging (MRI), which requires time-consuming and data-intensive processing and for which reducing the computing time is crucial to meet clinical practice constraints. The developed approach is based on web services and can be used for any medical imaging application.

    Anionic glycolipids related to glucuronosyldiacylglycerol inhibit protein kinase Akt

    New glucuronosyldiacylglycerol (GlcADG) analogues based on a 2-O-β-D-glucopyranosyl-sn-glycerol scaffold and carrying one or two acyl chains of different lengths have been synthesized as phosphatidylinositol 3-phosphate (PI3P) mimics targeting the protein kinase Akt. The Akt inhibitory effect of the prepared compounds was assayed using an in vitro kinase assay. The antiproliferative activity of the compounds was tested in the human ovarian carcinoma IGROV-1 cell line, in which we found that two of them could inhibit proliferation, in keeping with the target inhibitory effect.